
Conversation

@SantiagoPittella (Collaborator) commented Jan 9, 2026

closes #1487

depends on #1453 and #1501 to fully work.

@SantiagoPittella force-pushed the santiagopittella-decouple-ntx-builder branch from 53330ae to 5267cbf on January 12, 2026 14:43
@SantiagoPittella marked this pull request as ready for review on January 12, 2026 15:05
Comment on lines 136 to 138
tokio::spawn(async move {
    if let Err(err) = account_loader_store.stream_network_account_ids(account_tx).await {
        tracing::error!(%err, "failed to load network accounts from store");
Collaborator

Another question is whether we should be reacting to task failure here. As in, should we abort the ntx builder (or restart the task) if it fails? Or is logging a single error good enough?

Failure here would mean that we never load accounts with existing notes from the store unless they also get a new event coming in. So I think we should abort.

Collaborator Author

I went for the abort approach
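
A minimal sketch of what the abort approach can look like, assuming the spawned task returns its Result instead of only logging the error (identifiers follow the snippets above; the actual change may differ):

// Keep the JoinHandle rather than detaching the task; the store error is
// returned to the caller so the main event loop can observe it and abort.
let account_loader = tokio::spawn(async move {
    account_loader_store.stream_network_account_ids(account_tx).await
});
// Awaiting `account_loader` later surfaces both panics (as a JoinError) and the
// inner store error, as shown in the select! handling further down.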

Err(err) => {
    // Exponential backoff with base 500ms and max 30s.
    let backoff = Duration::from_millis(500)
        .saturating_mul(1 << retry_counter)
Collaborator

This will overflow/roll over after... 32 or 64 retries 😁 The exponentiation also needs to saturate (or the counter needs to be capped, I guess).

Collaborator Author

I fixed this here, and in some other places where we were using this pattern
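
A minimal sketch of a capped, saturating backoff along those lines (base 500ms and 30s max as in the original snippet; the helper name is illustrative):

use std::time::Duration;

/// Exponential backoff that cannot overflow: the exponent is clamped before
/// shifting, and the result is capped at 30 seconds.
fn backoff_for(retry_counter: u32) -> Duration {
    const BASE: Duration = Duration::from_millis(500);
    const MAX: Duration = Duration::from_secs(30);

    // Clamping to 16 keeps the shift well inside u32 range; the 30s cap below
    // is reached long before 2^16 * 500ms anyway.
    let exponent = retry_counter.min(16);
    BASE.saturating_mul(1u32 << exponent).min(MAX)
}

With the cap in place the retry counter can keep incrementing indefinitely without hitting the rollover mentioned above.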

/// This method is designed to be run in a background task, sending accounts to the main event
/// loop as they are loaded. This allows the ntx-builder to start processing mempool events
/// without waiting for all accounts to be preloaded.
#[instrument(target = COMPONENT, name = "store.client.load_committed_accounts", skip_all, err)]
Collaborator

There's a bit of a problem with the instrumentation as is. One needs to be careful with long-running tasks and traces. This is because a trace is only complete once the root span completes.

In this case, what we'll have is:

start load_committed_accounts
  fetch_page(0)
  submit_page(0)
  fetch_page(1)
  submit_page(1)
  ...
  ...
  ... a long, long, long time later
  fetch_page(chain tip)
  submit_page(chain tip)
close load_committed_accounts

Instead we shouldn't instrument this method at all, and each iteration of the loop should be its own root span. This means you'll want to reshuffle things a bit.

See here for an example with retries. The first example is what we'd have per loop iteration. Let me know if this is still ambiguous.
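
A minimal sketch of what that per-iteration root span could look like, assuming a load_page helper that loads and submits one page (identifiers are illustrative, not the PR's actual code):

use tracing::Instrument;

loop {
    // parent: None makes each iteration its own root span, so every page
    // produces a complete trace instead of one trace covering the whole preload.
    let span = tracing::info_span!(
        target: COMPONENT,
        parent: None,
        "store.client.load_committed_accounts.page"
    );
    // load_page loads one page from the store and sends it to the ntx-builder;
    // here it reports whether there are more pages left to load.
    let more_pages = load_page(&store, &sender).instrument(span).await?;
    if !more_pages {
        break;
    }
}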

Collaborator Author

Ok, perfect. I will check the network-monitor for this too and open an issue if I find something.

for account in accounts {
    // If the receiver is dropped, stop loading.
    if sender.send(account).await.is_err() {
        return Ok(());
Collaborator

Might be worth an error/warn log here
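
Something along these lines, for example (a sketch, not a committed change):

for account in accounts {
    // If the receiver is dropped, the main event loop has shut down; note it
    // in the logs and stop loading instead of failing silently.
    if sender.send(account).await.is_err() {
        tracing::warn!("account receiver dropped, stopping network account preload");
        return Ok(());
    }
}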

@sergerad changed the title from "chore: decouple ntx builder" to "chore: load network accounts asynchronously in NTX Builder" on Jan 19, 2026
    }
} => {
    account_loader = None;
    result.context("account loader task panicked")??;
Collaborator

We have flatten from an error extension trait

Suggested change
result.context("account loader task panicked")??;
result.context("account loader task panicked").flatten()?;

Comment on lines +172 to +177
Some(result) = async {
    match account_loader.as_mut() {
        Some(handle) => Some(handle.await),
        None => std::future::pending().await,
    }
} => {
Collaborator

An alternative to Option<Handle> is to reassign the handle itself upon completion, something like

let mut account_loader_handle = tokio::spawn(...);

// ...

select! {
    result = account_loader_handle => {
        result
            .context("panicked while loading accounts from store")
            .flatten()?;

        tracing::info!("account loading from store completed");
        account_loader_handle = std::future::pending();
    }
}

Comment on lines +283 to +285
tracing::Span::current()
    .record("chain_tip", pagination_info.chain_tip)
    .record("current_height", pagination_info.block_num);
Collaborator

This is the info for the next page, no? I would skip recording it here since it will get recorded with its own page on the next loop iteration.

Comment on lines +293 to +294
skip(self, accounts, sender),
fields(chain_tip = chain_tip, current_height = current_height)
Collaborator

I think it's okay to log this at the root span only

Suggested change
skip(self, accounts, sender),
fields(chain_tip = chain_tip, current_height = current_height)
skip_all

    block_range: RangeInclusive<BlockNumber>,
    sender: &tokio::sync::mpsc::Sender<NetworkAccountPrefix>,
) -> Result<Option<BlockNumber>, StoreError> {
    let (accounts, pagination_info) = self.fetch_page(block_range).await?;
Collaborator

Unfortunately tracing will only log the error's Display impl, while we usually want the full error report.

So instead we have to do this manually :*(

And the same for the error lower down.

Suggested change
let (accounts, pagination_info) = self.fetch_page(block_range).await?;
let (accounts, pagination_info) = self.fetch_page(block_range)
    .await
    .inspect_err(|err| tracing::Span::current().set_error(err))?;

A somewhat better way (to prevent missing this) is to construct an async chain so that it's all handled in one location. For example:

self.get_block_inputs(selected)
    .inspect_ok(BlockBatchesAndInputs::inject_telemetry)
    .and_then(|inputs| self.propose_block(inputs))
    .inspect_ok(|(proposed_block, _)| {
        ProposedBlock::inject_telemetry(proposed_block);
    })
    .and_then(|(proposed_block, inputs)| self.validate_block(proposed_block, inputs))
    .and_then(|(proposed_block, inputs, header, signature, body)| {
        self.prove_block(proposed_block, inputs, header, signature, body)
    })
    .inspect_ok(ProvenBlock::inject_telemetry)
    // Failure must be injected before the final pipeline stage, i.e. before commit is called.
    // The system cannot handle errors after it considers the process complete (which makes sense).
    .and_then(|proven_block| async { self.inject_failure(proven_block) })
    .and_then(|proven_block| self.commit_block(mempool, proven_block))
    // Handle errors by propagating the error to the root span and rolling back the block.
    .inspect_err(|err| Span::current().set_error(err))
    .or_else(|err| async {
        self.rollback_block(mempool, block_num).await;
        Err(err)
    })
    .await

Collaborator

Though this may not be a good fit depending on how data is broken up.
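
For the paged loading here, a minimal sketch of keeping the error recording in one place without a full futures chain (load_page stands in for the per-page helper; set_error is the span extension used above):

let result = self
    .load_page(block_range, &sender)
    .await
    .inspect_err(|err| tracing::Span::current().set_error(err));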

Successfully merging this pull request may close these issues: Decouple ntx builder from block-producer startup.